I.  Introduction

In this paper we report on some of our recent work on time scale
decomposition and aggregation of large-scale linear systems containing
weak couplings and finite-state Markov processes (FSMP's) containing
rare transitions. Our work builds on that of Coderch, et al. [1,2].
The focus of the work in [1] is on the asymptotic approximation of the
linear system

    $\dot{x}(t) = A(\epsilon)\,x(t).$    (1.1)

To  set  the  stage  for  our  work,  consider  a  second 
system 

    $\dot{z}(t) = B(\epsilon)\,z(t).$    (1.2)

We say that (1.2) is asymptotically equivalent to (1.1) if

    $\lim_{\epsilon \downarrow 0}\, \sup_{t \ge 0}\, \| e^{A(\epsilon)t} - e^{B(\epsilon)t} \| = 0.$    (1.3)
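
The supremum over all t ≥ 0 is what makes this criterion demanding. As an illustration that is not part of the original development, the following Python sketch evaluates the gap for two scalar pairs. The helper name `sup_gap`, the time grid, and the scalar rates are assumptions chosen for the illustration. Dropping an O(ε²) correction to an O(ε) rate gives a gap that shrinks with ε, whereas dropping the O(ε) rate itself leaves a gap close to 1 for every ε.

```python
import math

def sup_gap(a, b, t_grid):
    """Approximate sup over t of |exp(a*t) - exp(b*t)| on a finite grid."""
    return max(abs(math.exp(a * t) - math.exp(b * t)) for t in t_grid)

# Logarithmic grid from 1e-3 to 1e9 (an arbitrary choice covering both time scales).
T_GRID = [10.0 ** (-3 + 12 * k / 2000) for k in range(2001)]

def gaps(eps):
    # Case 1: the O(eps**2) correction is dropped, the O(eps) rate is kept.
    kept = sup_gap(-eps - eps**2, -eps, T_GRID)
    # Case 2: the O(eps) rate itself is dropped (system frozen at eps = 0).
    dropped = sup_gap(-eps, 0.0, T_GRID)
    return kept, dropped

for eps in (1e-1, 1e-2, 1e-3):
    kept, dropped = gaps(eps)
    print(f"eps={eps:g}  kept-rate gap={kept:.2e}  dropped-rate gap={dropped:.2f}")
```

In the first case the gap is bounded by ε/e for every t, so it vanishes as ε tends to 0. In the second case the approximation fails on the t/ε time scale however small ε is.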

The focus of [1] is on the construction of a system as in (1.2) where

    $B(\epsilon) = T\, \mathrm{diag}(A_0,\ \epsilon A_1,\ \ldots,\ \epsilon^{k} A_k)\, T^{-1}$    (1.4)

so that $A_0$ captures the order 1 time scale, $\epsilon A_1$ the
O(1/ε) scale, etc. What is accomplished in [1] is the development of a
procedure which determines if such a complete time scale decomposition
is possible and, if so, computes the $A_i$. In our opinion, this is a
very important result, but [1] left much to do, for example in
"peeling back" the mathematics of [1] to allow us to obtain a far
clearer and deeper understanding of time scale decompositions.

In [3] we presented some of our first results on an algebraic approach
to the problem of time scale decomposition of (1.1) based on viewing
A(ε) as a matrix over the ring W of functions of ε that are analytic
at ε = 0. This work allowed us to relate the general result of [1] to
earlier work as in [4] on special cases for which the form of the time
scale decomposition is intuitively clear. Furthermore, by making clear
the role of invariant factors in time scale decompositions, we were
able to formulate and solve a "time scale control" problem. This
algebraic approach also allows us to consider and solve several other
important problems which we report on in Section II. In particular, we
have been able to obtain a complete characterization of the
relationship between the eigenvalues of A(ε), its invariant factors,
and a condition introduced in [1] in a complicated way but to which we
can now give a far clearer interpretation. This characterization is
then used to develop (a) a procedure for computing the invariant
factors (and thus determining the number of modes at each time scale)
from the gcd's of principal minors of A(ε), and (b) a method for
scaling the system (1.1) when it does not have a uniform time scale
approximation to obtain a system that does.

[Footnote] ... Grant AFOSR-82-0258 and in part by the Army Research
Office under Grant DAAG 29-84-K-0005.

The work in [2] applies the method of [1] to FSMP's with rare
transitions (as parametrized by ε). In such a case x(t) in (1.1) is a
vector of state probabilities and A(ε) is a stochastic matrix
(off-diagonal elements ≥ 0 and column sums = 0).

What is done in [2] is to interpret the results of [1] as defining a
succession of stochastically discontinuous processes representing the
evolution of x at successively slower time scales (t, t/ε, ...) so
that at each stage transitions that occur at a faster rate appear to
occur instantaneously. This led naturally to an aggregation at each
stage that had the effect of removing these discontinuous transitions.
While this is an extremely important result, it does have some
drawbacks. In particular, the direct application of the results of [1]
involves a procedure whose probabilistic interpretation is, at best,
obscure. Furthermore, the computational feasibility of this approach
is dubious. This is in marked contrast to other work in this area,
such as [8], in which an intuitively appealing approach to aggregation
is described for the special class of models devoid of transient
states at any time scale: to obtain an aggregate description of the
FSMP at the slower time scale, we lump together the states in each
separate ergodic class at the faster time scale (i.e. with ε = 0) and
compute an average transition rate between these ergodic classes to be
used to describe evolution at the slower time scale.

While the intuition provided in this method is desirable, the
limitations of methods as in [8] are both the absence of proofs of
uniform asymptotic equivalence of the approximations produced and
their inability to handle transient states. In our recent work,
described in Section III, we have been working to bridge the gap
between the methods of [1] and [9]. In particular, we have developed
an understanding of the role of transient states and, in particular,
of what we call "splitting transient states." This understanding has
allowed us to formulate an approach to aggregation and asymptotic
approximation that in essence follows the procedure of [1] but does so
by modifying the FSMP at each stage so that the computations involved
are essentially those of the extension of the methods of [8] to allow
for transient (but non-splitting) states. The procedure we describe
and illustrate in Section III is computationally feasible for the
analysis of very complex systems.

II.  Algebraic  Methods  for  Time  Scale  Analysis 
The perturbed matrix A(ε) from (1.1) can be expressed in the
Smith-decomposed form A(ε) = P(ε)D(ε)Q(ε), where P(ε) and Q(ε) are
unimodular (i.e. |P(0)| ≠ 0 and |Q(0)| ≠ 0) and

    $D(\epsilon) = \text{block diagonal}\,[\epsilon^{j_1} I_{n_1},\ \ldots,\ \epsilon^{j_m} I_{n_m}].$

The diagonal elements of D(ε) are the invariant factors of A(ε). Since
P(ε) and its inverse are well defined in a neighborhood of ε = 0, we
can make the change of variables x(t) = P(ε)z(t) in (1.1), which
results in a description that we call explicit form:

    $\dot{z}(t) = D(\epsilon) Q(\epsilon) P(\epsilon) z(t) = D(\epsilon) \bar{A}(\epsilon) z(t), \quad \bar{A}(\epsilon)$ unimodular.    (2.1)

What we term a reduced explicit form for (1.1) is then obtained by
replacing the unimodular matrix Ā(ε) = Q(ε)P(ε) by the constant matrix
Ā(0) = Q(0)P(0), which we shall from now on simply denote by A, to
form the system

    $\dot{z}(t) = D(\epsilon) A z(t), \quad A = \begin{pmatrix} A_{11} & \cdots & A_{1m} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mm} \end{pmatrix}.$    (2.2)

The partitioning in (2.2) is that induced by the block sizes in D(ε).

A key role in our theorems is played by a set of matrices derived from
A in (2.2). To obtain this set, we first write

    $A^{(1)} = A$  and  $\hat{A}_{11} = A_{11}.$    (2.3a)

Now let $A^{(2)}$ denote the Schur complement of $\hat{A}_{11}$ in
$A^{(1)}$, and let $\hat{A}_{22}$ denote the $n_2 \times n_2$ leading
principal submatrix of this Schur complement, so that

    $\hat{A}_{22} = A_{22} - A_{21} (A_{11})^{-1} A_{12}.$    (2.3b)

Thus $\hat{A}_{22}$ is defined iff $\hat{A}_{11}$ is nonsingular.
Continuing this, we define $\hat{A}_{ii}$, i = 1 to m, as the
$n_i \times n_i$ leading principal submatrix of the Schur complement,
$A^{(i)}$, of $\hat{A}_{i-1,i-1}$ in $A^{(i-1)}$; again, it is defined
iff $\hat{A}_{i-1,i-1}$ is nonsingular.

A.  Connections to Results of [1,5]

It was shown in [5] that the necessary and sufficient condition for
(1.1) to have a complete time-scale decomposition is that A(ε) satisfy
a so-called "multiple semi-stability" (MSST) condition. Another
condition on A(ε) that will be of considerable interest to us here,
though it plays only a subsidiary role in [3] and [1], is the
"multiple semi-simple null structure" (MSSNS) condition.

The proofs of all the results that follow are in (or may be readily
deduced from) [5,6].

Theorem 1: The following are equivalent:

(a) A(ε) in (1.1) satisfies the MSSNS condition of [1].

(b) The orders of the eigenvalues of A(ε) are identical to the orders
    of its invariant factors.

(c) D(ε)A in (2.2) satisfies MSSNS.

(d) $\hat{A}_{ii}$, i = 1 to m, are defined and nonsingular.

Although our focus here is on MSSNS, we note that the statements in
Theorem 1 have analogs valid for MSST. If we replace MSSNS by MSST and
replace nonsingular by Hurwitz in (d), then the following implications
hold: (a) <=> (c) <=> (d).

Theorem 2: If A(ε) satisfies MSSNS, the eigenvalues of A(ε) and D(ε)A
are clustered in m groups, with those in the k-th group lying within
$O(\epsilon^{j_k + 1})$ of the eigenvalues of
$\epsilon^{j_k} \hat{A}_{kk}$.
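
As a numerical illustration, the sketch below checks the clustering on a hypothetical 2×2 reduced explicit form with D(ε) = diag(1, ε). The matrix `A`, the helper names, and the sample values of ε are assumptions made for this example. It also assumes that Theorem 2 is read as placing the k-th group within O(ε^(j_k+1)) of the eigenvalues of ε^(j_k) times the k-th Schur-complement block of (2.3).

```python
import math

def eig_2x2(m):
    """Eigenvalues (fast, slow) of a 2x2 matrix with real spectrum and negative trace."""
    (a, b), (c, d) = m
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    fast = (tr - disc) / 2.0
    slow = det / fast          # avoids cancellation when |slow| << |fast|
    return fast, slow

# Hypothetical constant matrix A with block sizes n1 = n2 = 1, orders j1 = 0, j2 = 1.
A = [[-1.0, 1.0], [1.0, -2.0]]
A11_hat = A[0][0]                                   # first block, as in (2.3a)
A22_hat = A[1][1] - A[1][0] * A[0][1] / A[0][0]     # Schur complement, as in (2.3b)

def cluster_errors(eps):
    """Distances of the two eigenvalues of D(eps)*A from A11_hat and eps*A22_hat."""
    DA = [[A[0][0], A[0][1]], [eps * A[1][0], eps * A[1][1]]]
    fast, slow = eig_2x2(DA)
    return abs(fast - A11_hat), abs(slow - eps * A22_hat)

for eps in (1e-1, 1e-2, 1e-3):
    err_fast, err_slow = cluster_errors(eps)
    print(f"eps={eps:g}  fast error={err_fast:.2e}  slow error={err_slow:.2e}")
```

Both Schur-complement blocks equal -1 here, so they are nonsingular. The fast error scales like ε and the slow error like ε², as the reading of Theorem 2 above predicts.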

It is evident from Theorem 1(b) and Theorem 2 that the MSSNS condition
is of value in frequency-scale approximation, a topic traditionally
studied in the context of root loci, see for example [7].

The frequency scales of (1.1) are equal to its invariant factors
precisely under the MSSNS condition, which is directly checked via
Theorem 1(d). The eigenvalues of A(ε) at the different frequency
scales are then approximated, according to Theorem 2, via the
$\hat{A}_{kk}$.

In general, the invariant factors of A(ε) are obtained from the gcd
q(i) of all i×i minors for each i, while the eigenvalues of A(ε) are
determined solely by its principal minors. By (b) of Theorem 1, it
must be true that, when A(ε) satisfies MSSNS, the invariant factors
are also determined by the principal minors alone; in fact, by the
gcd's of the i×i principal minors for each i, as described in the next
theorem. For the statement of the theorem, denote the orders of the
gcd's of the i×i principal minors by p(i), i = 1 to n, and define
p(0) = 0.

Now let b(i) be the slopes of the line segments forming the lower
(boundary of the convex) hull of the graph of p(i) versus i.

Theorem 3: (a) If A(ε) satisfies MSSNS, then the orders of its
invariant factors are equal to b(i), i = 1 to n.

(b) If A(ε) has invariant factor orders equal to the b(i) of the
explicit form, then A(ε) satisfies MSSNS.

Such "Newton polygon" constructions are to be expected in the context
of the present problem, cf. [7]. However, we have not encountered a
statement as simple as that of the above theorem in the literature.
The result will be useful in the discussion of scaling (II.B). Another
use is illustrated in Example 1.

Example 1: Suppose n = 4, and suppose p(1) = 3, p(2) = 2, p(3) = 2 and
p(4) = 3. Then Figure 2.1 shows that the b(i) are 2/3, 2/3, 2/3 and 1.
Since invariant factors of A(ε) cannot be of fractional order, the
only possible conclusion is that A(ε) does not satisfy MSSNS.
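
The slopes in Example 1 can be reproduced mechanically. The following sketch computes the lower convex hull of the points (i, p(i)) by repeatedly moving to the point of smallest slope. The function name `lower_hull_slopes` is ours, and p(0) = 0 is taken as the starting point of the graph.

```python
from fractions import Fraction

def lower_hull_slopes(p):
    """Slopes b(1..n) of the lower convex hull of the points (i, p[i]), i = 0..n."""
    n = len(p) - 1
    slopes, i = [], 0
    while i < n:
        # Next hull vertex: the point giving the smallest slope from (i, p[i]);
        # on ties take the farthest point so collinear points share one segment.
        best_j, best_slope = None, None
        for j in range(i + 1, n + 1):
            s = Fraction(p[j] - p[i], j - i)
            if best_slope is None or s <= best_slope:
                best_j, best_slope = j, s
        slopes.extend([best_slope] * (best_j - i))
        i = best_j
    return slopes

# Example 1: p(0) = 0, p(1) = 3, p(2) = 2, p(3) = 2, p(4) = 3.
b = lower_hull_slopes([0, 3, 2, 2, 3])
print([str(s) for s in b])                    # ['2/3', '2/3', '2/3', '1']
print(all(s.denominator == 1 for s in b))     # False: fractional orders occur
```

A fractional slope in the output is exactly the situation of Example 1, in which Theorem 3(a) rules out MSSNS.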


[Figure 2.1]


B.  Scaling 

A matrix A(ε) that does not satisfy MSSNS (and that therefore has
eigenvalue orders different from invariant factor orders, see Theorem
1(b)) can often be transformed by non-unimodular similarity
transformations to a matrix that does satisfy MSSNS. An important
reason for trying to induce MSSNS like this is to enable the
application of decomposition results such as Theorem 2 to estimating
the natural frequencies of (1.1).

We restrict ourselves to ε-dependent scaling of variables, i.e. to
non-unimodular diagonal similarity transformations. This enables us to
build directly on Theorem 3(b), because such transformations leave
both eigenvalues and principal minors unchanged, while they still
permit some modification of invariant factors. Before stating the
procedure in a general way, we consider an example.
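Example 2:

    $A(\epsilon) = \begin{pmatrix} -\epsilon & 1 \\ 0 & -\epsilon \end{pmatrix} = \begin{pmatrix} 1-\epsilon & 1 \\ -\epsilon & 1 \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & \epsilon^2 \end{pmatrix} \begin{pmatrix} -\epsilon & 1+\epsilon \\ -1 & 1 \end{pmatrix}$

The explicit form and reduced explicit form of A(ε) are then

    $\begin{pmatrix} -2\epsilon & 1 \\ -\epsilon^2 & 0 \end{pmatrix}$  and  $\begin{pmatrix} 0 & 1 \\ -\epsilon^2 & 0 \end{pmatrix}.$

Since $\hat{A}_{11} = 0$, it is evident from Theorem 1(d) that A(ε)
does not satisfy MSSNS. If we let S(ε) = diagonal [ε, 1], and
transform the explicit form to $S(\epsilon) D(\epsilon)
\bar{A}(\epsilon) S^{-1}(\epsilon)$, the resulting matrix satisfies
both MSSNS and MSST. Below, we outline a systematic approach for
generating appropriate scaling matrices S(ε).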
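
The algebra of Example 2 can be checked in exact rational arithmetic. The sketch below assumes the factorization reads A(ε) = P(ε)D(ε)Q(ε) with P = [[1-ε, 1], [-ε, 1]], D = diag(1, ε²) and Q = [[-ε, 1+ε], [-1, 1]]. The helper names and the sample value of ε are ours.

```python
from fractions import Fraction

def matmul(x, y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def example2(eps):
    """Return A, P*D*Q, the explicit form D*Q*P, and its scaling S*(D*Q*P)*S^-1."""
    A = [[-eps, 1], [0, -eps]]
    P = [[1 - eps, 1], [-eps, 1]]
    D = [[1, 0], [0, eps ** 2]]
    Q = [[-eps, 1 + eps], [-1, 1]]
    S = [[eps, 0], [0, 1]]
    S_inv = [[1 / eps, 0], [0, 1]]
    explicit = matmul(D, matmul(Q, P))
    scaled = matmul(S, matmul(explicit, S_inv))
    return A, matmul(P, matmul(D, Q)), explicit, scaled

eps = Fraction(1, 100)          # exact arithmetic at one sample value of eps
A, PDQ, explicit, scaled = example2(eps)
print(PDQ == A)                 # the factorization reproduces A(eps)
print(explicit)                 # [[-2*eps, 1], [-eps**2, 0]] at this eps
print(scaled)                   # eps * [[-2, 1], [-1, 0]] at this eps
```

The scaled matrix is ε times the constant matrix [[-2, 1], [-1, 0]], whose characteristic polynomial is (λ + 1)². That constant matrix is therefore both nonsingular and Hurwitz, consistent with the claim that the scaled system satisfies MSSNS and MSST.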

The first step of our general scaling procedure, again driven by
Theorem 3(b), is to transform A(ε) to its explicit form,
$A_e(\epsilon) = D(\epsilon) \bar{A}(\epsilon)$, see (2.1). The second
step then involves marking what we term a skeleton in the explicit
form: precisely one element from each row and column of
$A_e(\epsilon)$, with the additional constraint that no other element
in a row have lower order (in ε) than the skeleton element. Since Ā(0)
is nonsingular, the skeleton element in the i-th row has order equal
to the order of the i-th entry of D(ε). (The choice of skeleton may
not be unique, but see Remark 1 below.)
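
For small matrices a skeleton can be found by exhaustive search over permutations. The sketch below works on a matrix of entry orders in ε, with `INF` standing for an identically zero entry. The function name and the data layout are assumptions made for this illustration, and the sample orders are those of the matrix [[-2ε, 1], [-ε², 0]].

```python
from itertools import permutations

INF = float("inf")   # order assigned to an identically zero entry

def find_skeletons(orders):
    """All skeletons of a square matrix of entry orders.

    A skeleton picks column perm[i] in row i, one entry per row and column,
    such that no other entry in row i has lower order than the picked one.
    """
    n = len(orders)
    row_min = [min(row) for row in orders]
    return [perm for perm in permutations(range(n))
            if all(orders[i][perm[i]] == row_min[i] != INF for i in range(n))]

# Entry orders of [[-2*eps, 1], [-eps**2, 0]].
orders = [[1, 0],
          [2, INF]]
print(find_skeletons(orders))   # [(1, 0)]
```

The search is factorial in n, so it is only meant to make the definition concrete. The single skeleton found here consists of the two off-diagonal entries, with orders 0 and 2.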

Now identify the skeleton above with the n×n permutation matrix that
has 1's at the locations of the skeleton elements and 0's elsewhere.
Recall that any permutation can be uniquely expressed as a product of
disjoint cycles. It follows that, perhaps after some re-ordering of
the variables associated with our system, the elements of the skeleton
can be brought to the positions occupied by 1's in a block diagonal
canonical circulant matrix, whose diagonal blocks take the form:

    $\begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & & & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \end{pmatrix}$    (or simply 1 for a scalar block)

We shall restrict ourselves here to the case of only a single block;
the extension of the following results to the multiple block case is
described in [6]. Note that the required re-ordering of variables
corresponds to using a permutation matrix for similarity
transformation of the explicit form, and the result is still in
explicit form. This re-ordering of variables is the third step of our
procedure.
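
The re-ordering step can be sketched as follows, assuming the skeleton is stored as a tuple `perm` with `perm[i]` the column of the skeleton element in row i. The function names and the sample permutation are hypothetical.

```python
def disjoint_cycles(perm):
    """Cycles of a permutation given as perm[i] = column of the skeleton entry in row i."""
    seen, cycles = set(), []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        cycles.append(cycle)
    return cycles

def reorder_to_circulant(perm):
    """New variable order that puts each cycle into canonical circulant position."""
    return [i for cycle in disjoint_cycles(perm) for i in cycle]

perm = (2, 0, 1, 3)                  # hypothetical skeleton: one 3-cycle and one fixed row
print(disjoint_cycles(perm))         # [[0, 2, 1], [3]]
print(reorder_to_circulant(perm))    # [0, 2, 1, 3]
```

Listing the variables cycle by cycle sends the skeleton element of each row to the position just right of the diagonal within its block, and the last row of each cycle to the bottom-left corner of the block.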

The following description now takes $A_e(\epsilon)$ to have a skeleton
corresponding to a single canonical circulant block. We denote the
order of the skeleton element in the i-th row by a(i); note that these
are just the orders of the invariant factors. We make three further
assumptions. The first of these is that the orders of the diagonal
entries in the matrix are in nondecreasing order; this assumption is
also lifted in [6]. We have, however, been unable to relax the
remaining two assumptions:

Assumption 1: b(i) ≥ a(j) for i, j = 1 to n-1. To visualize what the
assumption states, plot both p(i) and q(i) versus i, as in Figure 2.2.
Then the slopes of the (solid) line segments making up the lower hull
of the p(i) curve are assumed to be not less than the slopes of those
making up the q(i) curve (the dotted lines), except at the last step
(from n-1 to n).

Assumption 2: In any principal submatrix of $A_e(\epsilon)$, the order
of any term formed by taking the product of precisely one element from
every row and column of the submatrix is not less than the order of
the corresponding principal minor.

With all the above assumptions, the following scaling can be shown to
transform the matrix to one that satisfies MSSNS:

    $S(\epsilon) = \text{diagonal}\,(\epsilon^{s_1},\ \epsilon^{s_2},\ \ldots,\ \epsilon^{s_{n-1}},\ 1),$    (2.5a)

    $s_i = s_{i+1} + b(i) - a(i), \quad s_n = 0.$    (2.5b)

The arguments, even for the special case we are considering, are
rather intricate, and are presented in [6]. They show that, under
Assumptions 1 and 2, the above scaling produces invariant factor
orders equal to b(i).
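
The scaling exponents follow from a backward recursion. The sketch below assumes that (2.5b) reads s_i = s_{i+1} + b(i) - a(i) with s_n = 0, and that the skeleton of a single canonical circulant block occupies the positions (i, i+1) and (n, 1). The function names are ours, and the sample values are the orders a(i) = 1, 1, 0, 6 and b(i) = 2 used in the example that follows.

```python
from fractions import Fraction

def scaling_exponents(a, b):
    """Exponents s_1..s_n of S(eps) from s_i = s_{i+1} + b(i) - a(i), with s_n = 0."""
    n = len(a)
    s = [Fraction(0)] * n
    for i in range(n - 2, -1, -1):          # backward recursion from s_n = 0
        s[i] = s[i + 1] + Fraction(b[i]) - Fraction(a[i])
    return s

def scaled_skeleton_orders(a, s):
    """Orders of the skeleton entries after the similarity transformation by S(eps).

    Entry (i, j) is multiplied by eps**(s_i - s_j); the skeleton sits at
    (i, i+1) for i < n and at (n, 1).
    """
    n = len(a)
    return [a[i] + s[i] - s[(i + 1) % n] for i in range(n)]

a = [1, 1, 0, 6]        # orders of the skeleton elements
b = [2, 2, 2, 2]        # lower-hull slopes
s = scaling_exponents(a, b)
print([int(x) for x in s])                               # [4, 3, 2, 0]
print([int(x) for x in scaled_skeleton_orders(a, s)])    # [2, 2, 2, 2]
```

Under this reading the scaling moves every skeleton element to order b(i), including the wrap-around element in the last row, because the a(i) and the b(i) have the same sum.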

The following example illustrates the process.
Example 3: Suppose

    A(ε) = [matrix not recoverable], where the circled elements
    constitute a skeleton.

It is easy to see that A(ε) is already in explicit form. Similarity
transformation by a permutation matrix, corresponding to a re-ordering
of variables, brings it to the form

    [matrix not recoverable]

It is evident that a(1) = 1, a(2) = 1, a(3) = 0, a(4) = 6, while some
computation shows that b(i) = 2 for i = 1 to 4. This information can
be visualized via Figure 2.3. From (2.5b) we have $s_3 = 2$,
$s_2 = 3$ and $s_1 = 4$. Similarity transforming $A_e(\epsilon)$ by
S(ε) defined in (2.5a), we get the matrix

    [matrix not recoverable]

It is easy to check that this matrix satisfies MSSNS.

Remark 1: While the validity of Assumptions 1 and 2 is independent of
which particular skeleton is chosen (when the choice is not unique),
it may be that one choice leads to a simpler procedure than another
choice.

Remark 2: Our scaling procedure involves ... (in S(ε)) and may produce
matrices whose entries are outside the ring W. However, by changing
the time scale (i.e. redefining ε), one can always bring the result of
the scaling to a form that our theorems apply to.

Remark 3: See [6] for application of this scaling procedure to cases
treated in [10].

Remark  4:  Other  scalings  are  possible,  even  when 
our  assumptions  are  violated,  and  further  work  in 
this  direction  will  be  worthwhile. 

III.  Aggregation  and  Time-scale  Decomposition  of 
Finite-state  Markov  Processes 
In this section we describe our recent work on aggregation of
finite-state Markov processes.

In order to provide some perspective on the key ideas underlying our
approach, we begin by reviewing the case of "nearly completely
decomposable systems" and contrasting what the approaches of [1] and
[9] have to say in this case.

A.  Nearly  Completely  Decomposable  Systems 

Consider a FSMP whose probabilistic evolution is described by (1.1)
with A(ε) = A + εB, where A describes a FSMP with several ergodic
classes and no transient states. The precise structure of the
transitions between these classes for ε > 0 is specified by the matrix
B. Such a process can be shown to have only two fundamental time
scales (t and t/ε) due to the combination of the irreducible structure
of A and the restriction to linear perturbations of the form εB.

If we follow the approach of [1], [2], the "slow" dynamics of the FSMP
are captured by the generator G(ε) = P(ε)A(ε)P(ε), where P(ε) is the
oblique projection onto the eigenspace of A(ε) of all the o(1)
eigenvalues along the space of O(1) eigenvalues. More precisely, the
fast dynamics, represented by A(0) (which capture the behavior of
(1.1) over intervals of the form $[0, T_1]$, $T_1 < \infty$), and the
slow dynamics εG(0), where

    $G(0) = \lim_{\epsilon \downarrow 0} \frac{1}{\epsilon}\, P(\epsilon) A(\epsilon) P(\epsilon)$    (3.1)

(which capture (1.1) over intervals of the form
$[T_2/\epsilon, \infty)$, $T_2 > 0$), together provide a uniform
approximation to the original FSMP.

An issue here is whether it is really necessary to calculate P(ε). In
particular, an interpretation of the reduction performed by Courtois
[8] on nearly completely decomposable chains is that
C(ε) = P(0)A(ε)P(0) = εP(0)BP(0) generates a Markov process whose
probability transition function is an approximation of the original
function with bounded error on an interval t ≥ T(ε) for some T(ε) that
grows without bound as ε → 0. What we conjecture is a good bit
stronger. Specifically, we claim that A(0) and P(0)BP(0) together
provide a uniform approximation to the original FSMP; that is, that
A(ε) = A + εB can be replaced by Ã(ε) = A + εP(0)BP(0). Note that in
this approach all that is required is P(0), which is nothing more than
the ergodic projection matrix associated with A(0) = A.

An example illustrating this is shown in Figure 3.1. The original
process is shown in (a); for this process,

    [equation (3.2) not recoverable]
The process corresponding to Ã(ε) is shown in (b). While this process
may appear to be more complex, what we have in fact done is to
maintain the equilibrium of the fast dynamics after rare transitions.
Further, since A and P(0)BP(0) commute,

    $e^{\tilde{A}(\epsilon)t} = e^{At}\, e^{P(0)BP(0)\epsilon t}.$    (3.3)

As in [2], we can write P(0) as

    $P(0) = UV$    (3.4)

where the columns of U represent the two possible ergodic probability
vectors of A, and V lumps the states in each ergodic class.

Combining (3.3) and (3.4) we have that

    $e^{\tilde{A}(\epsilon)t} = e^{At} + U\,(e^{B'\epsilon t} - I)\,V$    (3.5)

where

    $B' = VBU.$    (3.6)

Figure 3.1(c) provides the interpretation of (3.5): the matrix B'
describes the evolution of the slow, aggregated process, while the
matrix A specifies the faster evolution within either of the two
aggregate classes.
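
The objects in this construction are easy to compute for a small chain. The sketch below uses a hypothetical 4-state generator with two ergodic classes, written in the column convention of (1.1) so that column sums are zero. The matrices `A` and `B` and the helper names are ours, and the aggregated generator is assumed to be B' = VBU. The sketch checks that P(0) = UV is annihilated by A on both sides, which is why A and P(0)BP(0) commute.

```python
from fractions import Fraction as F

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

# Hypothetical fast generator A: ergodic classes {0, 1} and {2, 3}, no transient states.
A = [[-1,  2,  0,  0],
     [ 1, -2,  0,  0],
     [ 0,  0, -3,  1],
     [ 0,  0,  3, -1]]
# Hypothetical rare transitions B: state 0 -> 2 at rate 1, state 3 -> 1 at rate 2.
B = [[-1,  0,  0,  0],
     [ 0,  0,  0,  2],
     [ 1,  0,  0,  0],
     [ 0,  0,  0, -2]]

# Columns of U: ergodic probability vectors of the two classes of A.
U = [[F(2, 3), 0], [F(1, 3), 0], [0, F(1, 4)], [0, F(3, 4)]]
# V lumps the states of each class.
V = [[1, 1, 0, 0], [0, 0, 1, 1]]

P0 = matmul(U, V)                        # ergodic projection P(0) = U V
B_agg = matmul(V, matmul(B, U))          # aggregated generator B' = V B U (assumed form)
PBP = matmul(P0, matmul(B, P0))

print(B_agg)                             # 2x2 generator between the two classes
print(matmul(A, PBP) == matmul(PBP, A))  # True: A and P(0) B P(0) commute
```

The aggregate rate out of each class is the rare rate weighted by the ergodic probability of the state it leaves, here (2/3)(1) and (3/4)(2), which matches the averaging described for the methods of [8].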

This procedure can be extended in a straightforward fashion to systems
exhibiting multiple time scales when there are no transient states at
the first time scale. In such cases the generator has the form
A(ε) = A + εB(ε), where A generates no transient states. In this case,
we again conjecture that A(ε) and Ã(ε) = A + εP(0)B(ε)P(0) are
asymptotically equivalent. One can then proceed as in the example to
aggregate B(ε); one may then repeat this procedure several times as in
the procedures of [2] and [9].

A  natural  question  that  arises  at  this  point 
concerns  the  role  and  effect  of  transient  states. 
This  topic  is  taken  up  in  the  next  subsection. 

B.  Transient  States 

Though indecomposable structure of A(0) is a sufficient condition for
using the simplified procedure described above, a less restrictive
condition is available. In particular, we can allow A(0) to possess
"non-splitting transient states," i.e. transient states that may have
O(1) transitions into more than one A(0)-ergodic class but do not have
direct transitions into other such classes with rates of any higher
order. If splitting transient states are present, then the FSMP may
exhibit implicit time scales that cannot be captured directly by our
simplified procedure. As an example, consider the FSMP shown in Figure
3.2. Though this chain has only linear perturbation terms, the
eigenvalues are 0, O(1), and O(ε²). The generator P(0)A(ε)P(0) = 0
obviously does not capture the t/ε² time scale behavior. An intuitive
explanation of this is that using this reduction process implicitly
assumes that the "fast" components equilibrate between rare (O(ε)
rate) transitions. In this example, the t/ε² behavior is associated
with a sequence of two consecutive rare transitions (state 1 to 2
followed by 2 to 3). Beginning in state 2, there is an O(1)
probability of entering state 1 next and an O(ε) probability of
entering 3. Effectively, this O(ε) probability is lost in the
reduction procedure.

Transient states which do not exhibit such O(ε) probabilities of
entering various recurrent classes do not cause this problem. Consider
the related FSMP shown in Figure 3.3. Both states 2a and 2b are
transient at ε = 0, but neither of them splits as state 2 does in the
previous example.

The O(ε²) rate is explicit and P(0)A(ε)P(0) successfully captures the
t/ε² behavior. This chain is derived from the first by "splitting" the
transient state 2 into the nonsplitting transient states 2a and 2b,
depending on the first recurrent class entered. If we imagine having
an observation mechanism for this process that yields the value 1 or 3
if the FSMP is in state 1 or 3 respectively and the value 2 if the
FSMP is in 2a or 2b, then the transition rates between these
observation values are exactly those given in Figure 3.2. An
approximation of the process shown in Figure 3.3 can then be used to
construct an approximation of the original process.
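
The effect of splitting can be checked on a concrete pair of chains. The rates below are assumptions chosen to match the verbal description (1 to 2 at rate ε, 2 to 1 at rate 1, 2 to 3 at rate ε, state 3 absorbing); they are not read off the figures. The split chain sends state 1 to 2a or 2b according to the exit probabilities of state 2. Generators use the column convention of (1.1).

```python
from fractions import Fraction as F

def matmul(x, y):
    """Product of two matrices given as nested lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def reduced(A, P0):
    """The reduced generator P(0) A(eps) P(0)."""
    return matmul(P0, matmul(A, P0))

eps = F(1, 100)

# Assumed 3-state chain (states 1, 2, 3). Column j holds the rates out of state j.
A3 = [[-eps, 1,          0],
      [ eps, -(1 + eps), 0],
      [ 0,   eps,        0]]
P3 = [[1, 1, 0],      # at eps = 0, states 1 and 2 both end in state 1
      [0, 0, 0],
      [0, 0, 1]]

# Split chain (states 1, 2a, 2b, 3): state 2 is split by its exit destination.
to_1 = 1 / (1 + eps)          # probability that state 2 exits to state 1
to_3 = eps / (1 + eps)        # probability that state 2 exits to state 3
A4 = [[-eps,       1 + eps,    0,          0],
      [eps * to_1, -(1 + eps), 0,          0],
      [eps * to_3, 0,          -(1 + eps), 0],
      [0,          0,          1 + eps,    0]]
P4 = [[1, 1, 0, 0],   # at eps = 0, 2a ends in 1 and 2b ends in 3
      [0, 0, 0, 0],
      [0, 0, 0, 0],
      [0, 0, 1, 1]]

print(reduced(A3, P3))            # all zeros: the eps**2 behaviour is lost
print(reduced(A4, P4)[3][0])      # eps**2 / (1 + eps): rate from 1 to 3 is retained
```

For the unsplit chain the reduced generator vanishes identically, while for the split chain it retains an aggregate rate of order ε² from state 1 to state 3, in line with the discussion above.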
C.  The  General  Procedure 

An iterative procedure can be derived to construct a sequence of
aggregate perturbed generators such as B'(ε) at each successive time
scale. The steps involved in computing B'(ε) from A(ε) consist of
(i) identifying the recurrent classes and transient states at ε = 0,
(ii) calculating the invariant probabilities of the recurrent classes
at ε = 0 (together (i) and (ii) determine P(0)), (iii) calculating the
ε-dependence of the trapping probabilities starting in any state of
the transient class (so that we may split any splitting transient
class), and finally (iv) computing the aggregate rates B'(ε) from
these quantities. From A(0), B'(0), etc., an approximation can be
calculated which we conjecture is the same uniform approximation
derived in [2].
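
Step (i) is a purely graph-theoretic computation on the O(1)-rate transitions. A minimal sketch, assuming the chain at ε = 0 is given as a dictionary mapping each state to the set of states it can reach in one O(1)-rate transition (the function name and data layout are ours):

```python
def classify_states(rates):
    """Split states into recurrent (closed communicating) classes and transient states.

    rates[i] is the set of states reachable from i in one O(1)-rate transition,
    i.e. the transition graph of the chain at eps = 0.
    """
    def reachable(i):
        seen, stack = {i}, [i]
        while stack:
            for j in rates[stack.pop()]:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        return seen

    reach = {i: reachable(i) for i in rates}
    classes, transient = [], []
    for i in sorted(rates):
        # i is recurrent iff every state reachable from i can reach i back.
        if all(i in reach[j] for j in reach[i]):
            if not any(i in c for c in classes):
                classes.append(sorted(reach[i]))
        else:
            transient.append(i)
    return classes, transient

# Hypothetical chain at eps = 0: state 2 drains into state 1; states 1 and 3 are absorbing.
classes, transient = classify_states({1: set(), 2: {1}, 3: set()})
print(classes, transient)     # [[1], [3]] [2]
```

Steps (ii) to (iv) then operate class by class on this partition. Deciding whether a transient state splits also requires the ε-dependent rates, which this sketch does not use.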

This procedure can further be simplified by identifying modifications
of a perturbed chain which preserve its time scale behavior. For
example, it is conjectured that only the leading order term in ε of
any transition rate affects asymptotic behavior. Also, if there is an
indirect sequence of O(1) rate transitions from one state to another,
then any direct O(ε) rate between these states can be safely "pruned."


Though the reduction algorithm as outlined above has produced the same
uniform approximation as the methods of [2] and [9] in all the
examples we have considered, the proof that this is necessarily true
has not yet been established. Explicitly, two fundamental conjectures
form the basis of the result.

1) A(ε) = A(0) + εB(ε): Markov generator with A(0) irreducible

   F(ε) = P(0)A(ε)P(0) = εP(0)B(ε)P(0)

   conjecture

    $\lim_{\epsilon \downarrow 0}\ \sup_{\epsilon t \ge \delta > 0} \| \exp(A(\epsilon)t) - \exp(F(\epsilon)t) \| = 0$    (3.7)

    $\lim_{\epsilon \downarrow 0}\ \sup_{t \ge 0} \| \exp(A(\epsilon)t) - \exp(A(0)t) \exp(F(\epsilon)t) \| = 0$    (3.8)

2) Conjecture 1 is also true if A(0) has no splitting transient
   states, that is under the following condition. Let T denote the set
   of transient states of A(0) and let $R_1, \ldots, R_m$ denote its
   ergodic classes. Let ρ(t) denote the sample path of the FSMP (with
   ε included). Then for all $x_0 \in T$ and all i = 1, ..., m

    $\Pr\big( \rho(t_1) \in R_i \;\big|\; t_1 = \inf\{ t : \rho(t) \notin T \},\ \rho(0) = x_0 \big) = 0 \text{ or } O(1)$    (3.9)

Proving this second result allows us to consider arbitrary generators
A(ε) by conceptually splitting the transient class T into m copies,
$T_1, \ldots, T_m$, associated with the classes $R_1, \ldots, R_m$.